Complex Loop Dynamics Underpin Activity, Specificity, and Evolvability in the (βα)8 Barrel Enzymes of Histidine and Tryptophan Biosynthesis

Enzymes are conformationally dynamic, and their dynamical properties play an important role in regulating their specificity and evolvability. In this context, substantial attention has been paid to the role of ligand-gated conformational changes in enzyme catalysis; however, such studies have focused on tremendously proficient enzymes such as triosephosphate isomerase and orotidine 5′-monophosphate decarboxylase, where the rapid (μs timescale) motion of a single loop dominates the transition between catalytically inactive and active conformations. In contrast, the (βα)8-barrels of tryptophan and histidine biosynthesis, such as the specialist isomerase enzymes HisA and TrpF, and the bifunctional isomerase PriA, are decorated by multiple long loops that undergo conformational transitions on the ms (or slower) timescale. Studying the interdependent motions of multiple slow loops, and their role in catalysis, poses a significant computational challenge. This work combines conventional and enhanced molecular dynamics simulations with empirical valence bond simulations to provide rich details of the conformational behavior of the catalytic loops in HisA, PriA, and TrpF, and the role of their plasticity in facilitating bifunctionality in PriA and evolved HisA variants. In addition, we demonstrate that, similar to other enzymes activated by ligand-gated conformational changes, loops 3 and 4 of HisA and PriA act as gripper loops, facilitating the isomerization of the large bulky substrate ProFAR, albeit now on much slower timescales. This hints at convergent evolution on these different (βα)8-barrel scaffolds. Finally, our work reemphasizes the potential of engineering loop dynamics as a tool to artificially manipulate the catalytic repertoire of TIM-barrel proteins.


■ INTRODUCTION
Enzymes are conformationally dynamic systems, and their dynamical properties are clearly connected to their biological function. 1−7 Further, conformational dynamics, in particular, that of decorating loops that cover enzyme active sites, can regulate substrate selectivity, the evolution of new activities, and potentially also turnover rates. 3,4,8−29 Because of this, enzyme conformational ensembles can, in principle, be engineered to allow enzymes to acquire new catalytic functions and/or physiochemical properties. 23,24,29−33 Therefore, there is substantial interest in understanding loop dynamics and its impact on selectivity and catalysis, to improve our ability to manipulate enzyme conformational ensembles for engineering purposes.
It is noteworthy that several of these enzymes have TIMbarrel ((βα) 8 -barrel) folds. Most if not all TIM-barrel proteins possess decorating loops, 8,12,21,42 the conformational diversity of which likely plays an important role in regulating specificity and function. 15,18 These flexible loops can vary in length, 8 but are typically used to bind substrate, and to sequester the active site from solvent by closing over the active site. 43 However, these well-characterized examples of proteins activated by ligand-gated conformational changes all focus on the roles and importance of single loops, such as gripper loop 6 in TPI. Studying the ligand-gated motion of a single loop can already pose substantial challenges; 44 systems with multiple active site loops undergoing substantial conformational changes are even more complex and therefore understudied.
As shown in Figure 1, HisA and PriA are decorated by three long catalytic loops, loops 1, 5, and 6 (or two analogous loops in the case of TrpF, which has lost most of loop 1). 15,18 Of these, loop 5 carries key residues that are important for substrate binding, loop 6 carries the catalytically important aspartic acid side chain, and position 15 of loop 1 is important for interaction with substrate PRA. 18,45,54 In the case of PriA, structural analysis 15 has suggested that bifunctionality is facilitated by transitions between a "knot-like" pro-ProFAR conformation of loop 5, with W145 pointing "in" toward the substrate ProFAR (PDB ID: 3ZS4 55 ), and a pro-PRA β-hairpin conformation of loop 5, with R143 pointing in toward the substrate PRA (PDB ID: 2Y85 15,55 ). The evolved SeHisA MtPriA and 128−139 in TmTrpF) are highlighted in light green, yellow, and dark red, respectively. Loop 1 in TmTrpF is short (four residues); thus, no corresponding loop is annotated on this panel. For clarity, N7D and A176D reversions were applied to the structure of SeHisA in complex with ProFAR (these reversions were also applied in our simulations). (D) Proposed mechanism for the Amadori rearrangement leading to the isomerization of substrates ProFAR and PRA by the different enzymes. 49 variants generated by Nasvall et al. 46 have also been the subject of detailed structural and biochemical analyses. 18 As with PriA, 15 this analysis indicated that bifunctionality is driven by competition between not just the substrates ProFAR and PRA but also between structurally distinct conformations of loops 1 and 5, in particular 18 ( Figure S1).
Although isomerization of both ProFAR and PRA proceeds through the same Amadori rearrangement (Figure 1), ProFAR is a much larger molecule containing a second phosphate group. This group forms hydrogen bonds with the side chains of R83 from loop 3 and S103 from loop 4 of SeHisA and with the corresponding side chains of R85 and T105 in MtPriA ( Figure 2). Although neither of these residues are on the primary mobile loops of either enzyme (Figure 1), these interactions are very similar to analogous interactions in other enzymes activated by ligand-gated conformational changes, 34−41 suggesting that loops 3 and 4 may similarly act as "gripper loops", allowing these enzymes to attain relevant catalytically active conformations for the isomerization of the larger substrate. This effect would not be present when the smaller substrate PRA is bound to the active site, as it lacks a nonreactive phosphodianion group to interact with these loops.
Here, we combine conventional molecular dynamics (MD), enhanced sampling, and empirical valence bond (EVB) simulations to present a comprehensive computational study of a number of wild-type and variant forms of SeHisA, MtPriA, and the Thermotoga maritima TrpF (TmTrpF). All variants studied in this work, and the corresponding structures used, are summarized in Table S1. We chose these systems because high-quality structural data were available in unliganded and ligand-bound forms. For SeHisA, we have studied the unliganded and substrate-bound forms, 54 as well as key SeHisA variants from ref 18 that were selected based on their specificity patterns (specialists vs generalists, Table S2). MtPriA and TmTrpF were similarly selected on the basis of structural data in both unliganded and product (PRFAR), or product analogue (rCdRP), bound forms, respectively (Table  S1).
Prior simulations have emphasized that even studying the motion of one large dominating conformational change is computationally nontrivial, 22,44,56 and the current systems involve the interdependent conformational rearrangements of multiple loops simultaneously. Our simulations of HisA, PriA, and TrpF (1) provide rich details of the conformational behavior of the catalytic loops in the different systems and (2) provide insight into the link between conformational dynamics, catalytic activity, and functional evolution in the different enzymes.

■ METHODOLOGY
Methodological details are presented here in brief. Full details of all simulations and any nonconventional parameters used are provided in the Supporting Information.

System Preparation for Molecular Dynamics Simulations
Simulations were performed on wild-type SeHisA, MtPriA, and TmTrpF, as well as relevant enzyme variants, in both their unliganded forms and in complex with various ligands (substrates ProFAR and PRA and, in the case of the enhanced sampling simulations, products PRFAR and CdRP). A total of 14 crystal structures were used to generate starting points for these simulations, and all simulations performed, as well as the structures, are summarized in Table S1. Where present, the D7N, D11N, and D176A substitutions were reverted to wild type using the Dunbrack 2010 Rotamer Library, 57 as implemented in UCSF Chimera, v. 1.14. 58 Missing regions in the catalytic loops were reconstructed using Modeller v. 9.23. 59 The catalytic aspartic acid side chain in the active site of each enzyme (D176 in HisA, D175 in PriA and D126 in TrpF) was kept protonated in line with the mechanism shown in Figure 1. All other residues (except H50 in PriA, which was doubly protonated) were kept in their default protonation states at physiological pH determined by the use of PROPKA 3.1, 60 and visual inspection. The substrates ProFAR and PRA were manually placed into the relevant active sites in the same conformation as found in the structure of the HisA wild-type enzyme in complex with ProFAR. In the case of PRA ( Figure  1D), the substrate was placed into the relevant active sites by manual overlay of the reactive part of PRA with the reactive part of ProFAR, and with the carboxylate group of PRA keeping key interactions with active site residues. Partial charges for ligands ProFAR, PRA, PRFAR, and CdRP were calculated using the standard restrained electrostatic potential (RESP) protocol using Antechamber v. 17.3, 61 and based on the vacuum electrostatic potential calculated at the HF/6-31G(d) level of theory, using Gaussian 09 Rev. E.01. 62 All other simulation parameters were described using the general Amber force field 2 (GAFF2) 63 (see Tables S3−S6). Finally, to  55 ), and (C) TmTrpF in complex with product analogue rCdRP (PDB ID: 1LBM 49,55 ). Highlighted are key catalytic residues, including the catalytic active site aspartic acid (D176 in SeHisA, D175 in MtPriA, and D126 in TmTrpF), the active site tryptophan that forms stacking interactions with the larger substrate ProFAR in SeHisA and MtPriA (W145 in both enzymes), as well as the gripper residues that interact with the distal phosphate group of the larger substrate/product in SeHisA and MtPriA (R83 and S103 in SeHisA and R85 and T105 in MtPriA). Key hydrogen bonding interactions are also highlighted, using the distances (Å) found in the corresponding crystal structures. For clarity, N7D and A176D reversions were applied in SeHisA in complex with substrate ProFAR (these reversions were also applied in our simulations).
keep the substrate stably bound in the enzyme active sites, weak distance restraints were applied to protein−substrate distances in the conventional molecular dynamics (MD) simulations, as described in Table S7.

Conventional Molecular Dynamics Simulations
Conventional MD simulations were performed using the CUDA version of the PMEMD module of the AMBER 16 simulation package. 64 The protein, ligands, and solvent were described using the ff14SB force field, 65 the General AMBER Force Field 2 (GAFF2), 63 and the TIP3P water model, 66 respectively. Following initial minimization and equilibration, each system (summarized in Table S1) was subjected to 10 × 500 ns of molecular dynamics simulations controlled by the Langevin thermostat with a collision frequency of 2 ps −1 , 67 and the Berendsen barostat with a 1 ps coupling constant. 68 This led to a cumulative 5 μs of production simulations per system, and a cumulative total of 70 μs of conventional MD simulations over all systems studied (Table S1).

Enhanced Sampling Molecular Dynamics Simulations
Steered molecular dynamics simulations (sMD) were performed using GROMACS 2018.4 to pull products PRFAR and CdRP out of the active site of the SeHisA-(dup13−15/D10G/G102A/Q24L) variant, as described in the Supporting Information. System preparation was performed as for the conventional MD simulations, using the same force fields and water models. Following initial minimization and equilibration, 10 × 50 ns production MD simulations were performed on each system. The first 5 ns of production MD were unrestrained, after which an external force with a force constant of 10 kcal mol −1 Å −2 was applied to pull the product out of the active site. This external force was then released for the last 5 ns of each MD simulation run.

Empirical Valence Bond Simulations
We extended our previous empirical valence bond (EVB) approach 69−73 to study the enzyme-catalyzed opening of the ribose ring of substrates ProFAR and PRA (Figure 1), as catalyzed by wild-type and variant forms of HisA, PriA, and TrpF. This is the first step of the mechanism shown in Figure  1D, and used the valence bond states shown in Figure 3. Simulations were performed on wild-type SeHisA, MtPriA, and TmTrpF as well as selected variants, as described in the Supporting Information. All simulations were performed using the Q6 simulation package, 74,75 using the all-atom optimized potentials for liquid simulations (OPLS-AA) force field. 76 All EVB parameters and methods necessary to reproduce our work can be found in the Supporting Information, with the full parameters also uploaded to Zenodo (DOI: 10.5281/ zenodo.5893598). Each system was simulated using 30 individual replicas, with each replica first equilibrated for 20 ns and the endpoint of that equilibration being used as the starting point for propagating an EVB trajectory. Each EVB free energy perturbation/umbrella sampling (EVB-FEP/US) 69 simulation was simulated using 51 individual mapping windows of 200 ps of simulation time each, leading to a total of 10.2 ns of simulation time per individual EVB trajectory. The cumulative totals were 600 ns equilibration and 306 ns EVB simulation time per individual system, and a cumulative 12 μs of equilibration and 6.12 μs of EVB simulation time over all 20 systems studied.

Analysis of Conventional and Enhanced Sampling Molecular Dynamics Simulations
Unless stated otherwise, all analysis of all conventional and enhanced sampling MD simulations was performed using CPPTRAJ. 77 Hydrogen bonds were defined as formed if the donor−acceptor distance was ≤3.0 Å and the donor− hydrogen-acceptor angle was within 180 ± 45°. Principal component analysis (PCA) was performed by first RMS fitting to a whole protein C α carbon atoms and then performing PCA on the C α carbon atoms of loops 1, 5, and 6, as well as loop 1 for HisA loop 1 elongated systems analysis. Other analyses were performed as described in the Supporting Information. SeHisA and MtPriA have similar structures, with a C α -RMSD of 1.09 Å when PDB IDs 3ZS4 55 and 5A5W 54,55 are compared. In contrast, TmTrpF is significantly smaller, with 40 fewer residues. We calculated the mean and standard deviations of the active site volumes of each unliganded enzyme during 10 × 500 ns conventional MD simulations using MDpocket, 78 obtaining average volumes of 745.4 ± 129.6, 1033.8 ± 217.0, and 1173.5 ± 158.0 Å 3 , for TrpF, PriA, and HisA, respectively (Table S8). The TrpF-active site is more compact than those of PriA and HisA, which have successively larger and more "flexible" active site pockets (using the standard deviation on the volume as a proxy for flexibility). For comparison, the substrates ProFAR and PRA have volumes of 829 and 559 Å 3 , respectively, calculated using Mol_Volume Version 1.0, with default radii of 1.7 Å and a probe sphere of 0.5 Å. This confirms the structural data indicating that the active site pocket of TrpF is too compact to accommodate the larger substrate, ProFAR, leading to the selectivity of this enzyme toward PRA. 79 Furthermore, the PriA active site is the most flexible of the three, in line with structural data, 15 indicating that PriA is capable of significantly rearranging its active site (in particular, loop 5 conformation) when accommodating the different substrates ProFAR and PriA ( Figure S1).
Two key active site side chains in HisA and PriA are important for binding ProFAR: 15,18,54,80 W145, which forms a stabilizing stacking interaction with the substrate, and R83 (R85), which interacts with the second phosphodianion group of the substrate ( Figure 2). To explore the conformational These data show that the tryptophan and arginine side chains are both highly conformationally flexible in the unliganded enzymes. However, while ProFAR binding to HisA restricts the conformational space of the arginine side chain on the gripper loop 3 to a catalytically competent position that helps stabilize the bound substrate, in PriA, the R143 side chain is only 4.8 Å from the gripper residue R85 (distance between the two side-chain carbon atoms, based on PDB ID: 3ZS4 55 ). This creates electrostatic repulsion between the two arginine side chains, destabilizing both loop 5 as well as the interaction between ProFAR and the R85 side chain ( Figure S3). In contrast, when PRA is bound to either HisA or PriA, the R83/R85 side chains increase their conformational flexibility, sampling roughly the same conformational space as in the unliganded enzyme ( Figure S2). Therefore, the interaction of these residues with the larger substrate ProFAR is likely playing an important role in the ability of these enzymes to bind and isomerize it.
The W145 side chain forms a stabilizing stacking interaction with the substrate and slightly increases its conformational flexibility when the smaller substrate PRA is bound to HisA (note that PRA was placed manually in the active site by overlay with ProFAR), although the flexibility of this side chain is still substantially reduced compared to the unliganded enzyme ( Figure 4A). However, in the case of PriA, both the W145 and R143 side chains are conformationally restricted to a catalytically competent position due to a rearrangement of loop 5 upon PRA binding that swaps the position between these two residues compared to when ProFAR is bound to PriA, preventing the electrostatic repulsion between the R85 and R143 side chains that is observed when ProFAR (PRFAR) is bound to the active site ( Figure S1). 15 As noted previously, the R83/R85 side chain is one of a number of key residues on the gripper loop that interact with the distal phosphodianion group of the larger substrate ProFAR, contributing to the stabilization of the substrate in the HisA active site. Hence, in PriA, the loop 5 rearrangement required for ProFAR binding 15 prevents R85 from gripping the second phosphodianion group, leading to a preference for the isomerization of PRA. These gripper interactions are similar to corresponding interactions in enzymes such as TIM, OMPDC, and GPDH, 34−38 where interactions with the nonreactive phosphodianion group of the substrate drive catalytically important ligand-gated conformational changes. Here, the interaction between the HisA gripper residues and the remote phosphodianion group of ProFAR appears to be similarly important for maintaining the closed conformation of loop 5, and when this interaction is lost, as in PriA, we see the corresponding opening of loop 5 ( Figure S3). This supports the likelihood that HisA and PriA are also activated by ligandgated conformational changes, albeit with more complex loop dynamics (due to the involvement of not one but three highly mobile and long catalytic loops) than in other previously characterized systems.

Conformational Dynamics of Key Catalytic Loops of HisA, PriA, and TrpF
To further explore the impact of ProFAR and PRA binding on loop dynamics, we extended our simulations to include TrpF in both its unliganded form, and in complex with PRA (Table  S1). We then performed principal component analysis (PCA) to characterize the motion of the catalytic loops during our simulations of each system, similar to prior analysis we have performed on TPI. 44 The PCA was performed on the massweighted Cartesian coordinates of each enzyme compared to the coordinates of the corresponding closed state, allowing us

JACS Au
pubs.acs.org/jacsau Article to explore the variation of the conformations of these loops in coordinate space. Figure 5 shows an overview of structural changes in the catalytic loops along the first two principal components, PC1 and PC2, as well as the minimum and maximum deviations of each of the catalytic loops from the corresponding closed reference state along each PC for each enzyme. PC1 and PC2 make contributions of 41.2, 70.9, and 51.0% and 16.4, 6.0, and 13.9% to the overall variance in each of HisA, TrpF, and PriA, respectively. Significant conformational motion is observed along both PCs; however, PC1 primarily describes loop transitions between closed and open conformational states, whereas PC2 describes conformational variation in the loops during these transitions, including transitions between different open conformations.
We subsequently projected the free energies for each enzyme along the most dominant motions, PC1 and PC2, from simulations of each of HisA, PriA, and TrpF in their unliganded forms as well as in complex with substrates ProFAR and PRA, respectively (where PRA was artificially placed in the HisA active site by manual overlay with the reactive part of ProFAR). This allowed us to compare both the free energy surfaces defined by these two PCs between the different enzymes, and the effect of ligand binding on these surfaces. The resulting combined motion of all key catalytic loops along each principal component is illustrated in Figure 6. In the unliganded form of each enzyme, the catalytic loops are highly conformationally diverse and explore a range of "wide-open" conformations ( Figure 5). This is consistent with our prior simulation studies on both TPI, 44 and the protein tyrosine phosphatases PTP1B and YopH, 22 as well as chimeric forms of these enzymes. 82 However, and consistent with structural data, ProFAR binding to HisA fully restricts the conformational sampling of all three loops ( Figure 6B). In the case of PriA, the binding of both ProFAR and PRA also stabilizes the closed conformation of the three active site loops, but still allows for some conformational flexibility in these loops ( Figure 6, based on both the topologies of the projected free energy surfaces and the corresponding energies). In sharp contrast, in the liganded form of TrpF, our MD simulations show that PRA is not stable in the active site due to the flexibility of loop 6, which explores transitions toward open conformations, similarly to the unliganded system ( Figures  5E,F and 6G,H). This in turn increases the mobility of PRA in the active site even with the restraints applied in the simulations (Table S7), allowing it to sample noncatalytic In the crystal structure of the unliganded enzyme (PDB ID: 1NSJ 55,81 ), this loop is present in a closed position but with missing density, whereas in our simulations of both the liganded and unliganded form of the enzyme, loop 6 samples open conformations, suggesting that the crystal structure does not reveal the correct closed state to stabilize the substrate PRA in the active site. We note that the structure used for these simulations (PDB ID 1LBM 49,55 ) was solved in complex with the product analogue rCdRP. Our simulations suggest that the loop conformations observed in this structure are a conformational state on the trajectory to product release, rather than an ideal conformational state for stabilizing the Michaelis complex. During our simulations of HisA and PriA, we observed the formation of a stacking interaction between ProFAR and the side chain of W145 ( Figure S4), with an average distance of 3.9 ± 0.3 Å and an angle γ = 11.7 ± 5.5°between the center of mass of the ProFAR imidazole ring and the W145 indole ring. Our results thus confirm the role for W145 that was proposed based on experimentally determined structures, 54 and is consistent with experiments indicating that HisA activity is abolished in SeHisA(dup13−15) because the extended conformation of loop 1 blocks this side chain from interacting with ProFAR. 18 In contrast, in the case of PriA, and again in agreement with prior structural analysis, 15 we observe two possible conformations of loop 5, depending on which substrate is bound to the active site. When ProFAR is bound, we sample a conformation similar to that observed in wild-type HisA ( Figure S1A) with a similar stacking interaction between ProFAR and W145 ( Figure S4). However, the loop 5 rearrangement required to optimize the stacking position of W145 with ProFAR creates transient electrostatic repulsion between loop 5 and the rest of the enzyme, making this loop more conformationally dynamic, which we observe in our analysis in the form of an increased standard deviation in the distance and angle of the corresponding stacking interaction (d = 4.1 ± 0.6 Å, γ = 19.2 ± 13.9°) (Figures S3 and S4). The greater plasticity of this interaction, in turn, decreases substrate stability in the active site (the ProFAR RMSF increases from 18.4 Å in our simulations of wild-type HisA to 22.2 Å in our simulations of wild-type PriA), and thus the corresponding ProFAR isomerization activity of PriA.
For comparison, in our simulations of PRA-bound PriA, we sampled an active site conformation in which the R143 side chain (loop 5 residue) forms salt bridges with the side chains of D130 and D175, and the arginine acts as a "shield", dampening the electrostatic repulsion between the D130 side chain and the anthranilate carboxylate group of PRA. This interaction also stabilizes the catalytic aspartic (D175), placing it in an optimal position for catalysis ( Figure S5 and Table S9), as shown in previous studies. 15 This PriA conformation is similar to the "TrpF-active" conformation observed in the SeHisA(dup13−15/D10G/Q24L/G102A) crystal structure 18 (PDB ID: 5AB3, 18,55 Figure S6, with manual placement of PRA in the active site), where the arginine is close to residue D129. We do not observe the formation of a corresponding interaction in our simulations of PRA-bound wild-type HisA, as the negative charge repulsion between PRA and the D129 side chain destabilizes the PRA position in the active site ( Figure S7), as well as the stability of loop 6 carrying the key catalytic aspartate ( Figure S7). We do, however, observe a similar interaction with the R169 side chain in the SeHisA-(L169R) variant, with interactions with D129 and, in this case, a salt-bridge interaction with the anthranilate carboxylate group of PRA ( Figure S7 and Table S9), consistent with experimental work demonstrating that the introduction of the L169R substitution in HisA induces TrpF activity (Table  S2). 46 Finally, in the case of PRA-bound TrpF, we observe a saltbridge interaction between E184 and the side chain of R36 on loop 3 ( Figure S6 and Table S9), and we can see that as for HisA and PriA, R36 is again acting as a shield avoiding possible negative repulsion interactions between the substrate and the negatively charged side chain. However, we do not observe clear interactions between the R36 side chain and the substrate PRA ( Figure S6 and Table S9, with the fraction of simulation time in which this interaction is observed being <0.1).
Overall, we observed that this arginine plays an essential role in the introduction of TrpF activity, by shielding electrostatic repulsion between negatively charged side chains and the anthranilate carboxylate group of PRA. This is in agreement with experiments where introducing an arginine or removing the negative residue introduces TrpF activity in HisA systems. 46 Conformational Dynamics of Loop 1 in HisA and PriA and Its Impact on Selectivity The importance of loops 5 and 6 of HisA and PriA for binding and catalysis is documented. 15,18,54 In contrast, the precise catalytic role of loop 1 remains unclear, although substitutions at position 15 on this loop are important for facilitating the TrpF activity of this enzyme, 18,45,54 and extending the conformation of loop 1 through duplication of residues 13− 15 (HisA(dup13−15)) plays an important role in the acquisition of bifunctionality. 18,46,83 In this context, SeHisA-(dup13−15/D10G) is a bifunctional enzyme that can catalyze the isomerization of both ProFAR and PRA with modest catalytic efficiencies, 18 Figure S8A,C, backbone RMSD of loop residues of up to ∼4.0 Å in loop 1 and ∼3.5 Å in loop 6, compared to the starting closed conformation). In contrast, CdRP is able to leave the active site without inducing large conformational changes in loop 1 (loop 1 opens in only one out of ten replicates), but pulling it out from the active site still induces a conformational change in loop 6 ( Figure S8B,D, backbone RMSD of loop residues of up to ∼2.0 Å in loop 1 and ∼3.5 Å in loop 6, compared to the starting closed conformation). One interpretation consistent with this observation is that substantial loop rearrangement is required for efficient product release, and this in turn makes it possible that a slow (rate-limiting) product release step might be the reason for the low turnover numbers observed for catalyzing an intrinsically facile reaction 84 (Table S2). Interestingly, while the elongation of loop 1 through the duplication of residues 13−15 (VVR) appears to be essential for the change of specificity toward the isomerization of PRA (facilitated by the presence of a new stabilizing arginine side chain close to the active site, Figure S1), 18 simply the loop duplication by itself is not enough to induce bifunctionality. That is, while the duplication elongates loop 1, it also rigidifies it such that SeHisA(dup13−15) does show some ability to isomerize PRA (k cat > 0.15 s −1 ), but at the expense of losing all ability to isomerize ProFAR. 18 Critical to bifunctionality is the inclusion of an additional substitution, D10G (SeHisA-(dup13−15/D10G)).
In all HisA variants we studied, when performing conventional MD simulations starting from the closed conformation of loop 1, this loop is very stable and remains closed over our simulation timescales (Table S1) Figure 8B, wide-open conformation). More specifically, as can be seen in Figure 8, the inclusion of the D10G substitution alone is enough to increase the conformational space sampled by loop 1 in SeHisA(D10G). The dup13−15 elongation rigidifies the whole loop compared to the wild type, and adding the D10G variant on the background of the elongation (SeHisA(dup13−15/D10G)) once again expands the conformational space sampled by the loop.
Our conventional MD simulations (500 ns per trajectory) are short, taking into account the slow turnover numbers of these enzymes (that suggest loop motions on the ms to s timescale). 18 However, our observation that SeHisA(D10G) has an expanded conformational space sampled by loop 1, compared to SeHisA(dup13−15/D10G), agrees with prior NMR relaxation dispersion experiments. 18 These detected μs to ms motions at 14 backbone 15 N positions in the SeHisA(dup13−15/D10G) variant, compared to only three positions for the SeHisA(dup13−15) variant, and with two resonances that are unique to SeHisA(dup13−15/D10G). As a result, adding this substitution is sufficient to convert SeHisA(dup13−15/D10G) back to a bifunctional enzyme, through the exploitation of conformational dynamics, with k cat of 0.09 s −1 for the isomerization of PRA, and 0.05 s −1 for the isomerization of ProFAR (Table S2). 18 To explore this further, we examined simulations of four loop-elongated variants, specifically: dup13−15 (PDB ID: 5G2I 18 (Table S2).
Our simulations revealed how all of these enzymes are able to populate closed, open, and wide-open conformations (Table  1). However, the relative populations of these states depend strongly on enzyme variant: already, the D10G substitution by itself appears to be sufficient to cause a conformational shift toward a closed conformation compared to the distribution observed in the wild-type enzyme, and the population of the wide-open conformation is further reduced in the SeHisA-       For EVB simulation details, see the Methodology section. Structures were selected based on clustering analysis using the hierarchical agglomerative algorithm, as implemented in CPPTRAJ. 77 Note that the annotated catalytic distances are average values over 6000 snapshots extracted for each state from our EVB trajectories (from 30× individual 200 ps EVB mapping windows per stationary point/system). For a full list of reacting distances across all variants, see Tables S12 and S13. To further probe the role of loop 1 in HisA and PriA, we performed empirical valence bond (EVB) 69 simulations of the initial ribose ring opening of substrates ProFAR and PRA (Figure 1), as catalyzed by wild-type and variant forms of the two enzymes. The calculated activation free energies for this step are not trivial to directly compare with the experimental turnover numbers; unfortunately, no kinetic data exist for the rates of the individual chemical steps. Furthermore, the observed k cat values for these systems are extremely lowon the order of 1 s −1 (or lower) for both the HisA and TrpF reactions catalyzed by HisA and its variants. 18 However, in the case of Escherichia coli TrpF, which has a higher turnover number than any of the enzymes studied here (k cat of 30−40 s −1 , see Table S2 for comparison), the rate-limiting step is an uncatalyzed keto-enol tautomerization that occurs after the ring-opening step (Figure 1). 85 When also taking into account the potential involvement of loop dynamics in determining turnover rates, as seen in other enzymes with catalytically important conformational changes such as protein tyrosine phosphatases 16,19,22,82 and indole-3glycerol phosphate synthase, 86 this means the experimental turnover numbers for the isomerization of ProFAR and PRA by the enzymes of interest do not correspond to a chemical step occurring in the enzyme active site. Furthermore, the Amadori reaction that occurs between the enzyme-catalyzed ring-opening reaction and the nonenzymatic tautomerization is likely to be fast (the uncatalyzed reaction occurs spontaneously at 25°C). 84 Thus, the ring-opening reaction is likely the slowest enzyme-catalyzed chemical step in the catalytic cycle, as supported by QM/MM studies of the HisA-catalyzed reaction. 83 However, even if the experimentally measured turnover numbers (k cat ) do not directly correspond to this step, they do produce a lower limit for the rate of this step (and thus an upper limit for the corresponding activation free energy for the rate-limiting step of the enzymatic reaction). Thus, comparing the calculated activation free energies for the ring opening in different variants, and in different conformational states of loop 1, can provide insight into the impact of loop dynamics and key amino acid substitutions on the rate of the slowest enzyme-catalyzed step.
Taking these limitations into account, the experimental and calculated activation energies are shown in Tables S10 and S11 and Figure 9, with representative Michaelis complexes (MC), transition states (TS), and ring-open intermediates for the ribose ring-opening reaction catalyzed by each wild-type enzyme shown in Figure 10 (ProFAR) and Figure S10 (PRA). We note that PRA has much more conformational freedom in the HisA and PriA active sites, which are both evolved to (also) accommodate the larger substrate ProFAR, and is thus capable of sampling a broad number of different conformations at the Michaelis complex, leading to an increased standard error of the mean in comparison to systems where ProFAR is bound to the active site of these enzymes. Curiously, our calculations reproduce the experimental trend in activation free energies derived from the turnover numbers relatively well for both substrates ProFAR (Table S10) and PRA (Table S11) for nearly all variants, despite the fact that observed k cat does not correspond to a chemical step in the enzyme, as discussed above.
We used MtPriA in its pro-ProFAR conformation as a reference state to calibrate our EVB simulations of the initial ring-opening of ProFAR, obtaining an activation free energy of 18.3 ± 0.5 kcal mol −1 for this reaction. The resulting EVB parameters were then used unchanged in all relevant systems, to obtain the data presented in Table S10. We obtained a slightly lower activation free energy of 17.5 ± 0.6 kcal mol −1 for the analogous reaction catalyzed by wild-type HisA, which is in agreement with the experimental observation that SeHisA is a better catalyst of ProFAR isomerization than MtPriA (Table S10). To further validate our results, we performed single amino acid substitutions in each enzyme (R19A and D130A in PriA and S202A, D129N and D10G in HisA), and performed EVB simulations on these variants. In nearly all cases, we observed similar or higher activation free energies for the ring-opening step compared to the respective wild-type enzymes, in agreement with experimentally observed loss of activity upon these substitutions. 15,18 We note, however, that this effect is less pronounced than the experiment in the case of the HisA(S202A) and PriA(D130A) variants, suggesting that the experimentally observed loss of activity is due to either a change in substrate positioning or loop conformation or dynamics, which we are not necessarily able to capture in our simulations when simply starting from the wild-type crystal structure and manually truncating these residues.
In the case of PRA (Table S11), we again used the reaction catalyzed by MtPriA as our EVB reference state, this time with loop 5 in its pro-PRA conformation. While wild-type SeHisA does not show TrpF activity, 18 we nevertheless modeled this reaction in SeHisA for comparison, starting from an idealized position of PRA in the active site (based on overlay with substrate ProFAR), and obtained an activation free energy of 19.5 ± 0.6 kcal mol −1 . This is higher than the value obtained for ProFAR ring-opening by SeHisA (17.5 ± 0.6 kcal mol −1 , Table S10), but is very similar to the value obtained for SeHisA(dup13−15) (19.9 ± 1.2 kcal mol −1 , Table S11). This suggests that, in theory, SeHisA could catalyze PRA ringopening if all loops are in the correct conformation and the substrate is optimally positioned, and that the lack of activity is not because of a high barrier to the chemical reaction catalyzed by this enzyme. All subsequent variants show a steady decrease in calculated activation free energies for the ring-opening reaction catalyzed from the loop closed conformation, in line with the increased population of the loop 1 closed conformation (Table 1) compensating for the lack of interactions between the substrate PRA and the stabilizing gripper loop 3 ( Figure 3).
In the case of MtPriA we modeled three individual substitutions (R19A, D130A, and R143A) and extended our EVB simulations (Table S11 and Figure 9), to specifically capture the impact of losing the electrostatic contribution of each truncated side chain on the activation free energy. In the case of MtPriA(R19A), we obtained a slight increase in activation free energy that trends with the loss of activity observed experimentally. 15 Structurally, this side chain is located on loop 1, and has been suggested to be important for the substrate-specific formation of the active site into pro-ProFAR and pro-PRA conformations. 15 Our EVB calculations on both ProFAR and PRA ring-opening as catalyzed by MtPRA(R19A) (Tables S10 and S11) suggest an important electrostatic role for this residue. Similarly, kinetic and structural data suggest that truncation of R143 and D130 to alanine negatively impacts PRA isomerization (in the case of JACS Au pubs.acs.org/jacsau Article R143A, k cat /K M is reduced from 1.7 × 10 5 to 6.0 × 10 3 M −1 s −1 on the introduction of this substitution). 15 In contrast, we obtain lower activation free energies for PRA ring-opening catalyzed by both the D130A and R143A variants. However, especially in the case of the R143A variant, it is unclear if the experimentally observed effect is expressed on k cat , K M , or both, and it is plausible that the loss of activity is due to structural effects that prohibit productive substrate binding, and therefore are not captured in our simulations. This scenario would be similar to our observations from an analogous system activated by a ligand-gated conformational change, glycerol-3phosphate dehydrogenase (GPDH). 38 Finally, to examine how important loop 1 closure is for PRA isomerization, we performed EVB simulations of ProFAR and PRA ring-opening as catalyzed by SeHisA (both wild-type and the SeHisA(dup13−15/D10G/G102A/Q24L/V15[b]M variant)) and MtPriA (wild-type) with loop 1 in the open conformation (Tables S10 and S11). Due to the lack of crystal structures, we extracted structures from our conventional MD simulations as starting points for these EVB simulations. For PRA isomerization, we obtained much higher activation free energies for ring-opening when loop 1 was modeled in an open conformation than a closed conformation, due to a combination of the loss of key interactions between loop 1 and the substrate, and also extra solvent exposure of the active site when this loop opens. However, we observed no impact on the activation free energy when modeling ProFAR isomerization starting with loop 1 open. Therefore, while the catalytic importance of loops 5 and 6 is well established, 15,18,54 our EVB calculations clearly show that, in addition, full closure of loop 1 into a catalytically competent conformation is essential for efficient PRA isomerization.

■ OVERVIEW AND CONCLUSIONS
In the present work, we combined conventional and enhanced sampling MD simulations, with EVB calculations, to explore the role of loop dynamics in dictating the selectivity and evolvability of the evolutionarily important model enzymes, HisA, TrpF, and PriA. 18,[45][46][47][48]87 The roles of loop dynamics and ligand-gated conformational changes in TIM-barrel proteins and proteins from other folds have been a topic of substantial research interest (e.g., refs 8, 34−38, 88−97). However, what makes the current enzymes stand out from these prior studies is the importance of not one but two (TrpF) or even three (HisA and PriA) long, mobile loops (Figure 1), the specific conformations of which have been suggested to play an important role in facilitating the substrate selectivity of PriA and evolved HisA variants. 15,18,54 Thus, these enzymes undergo particularly complex loop dynamics. In addition, prior enzymes that have been characterized as being activated by ligand-gated conformational changes, such as TPI and OMPDC, are extremely proficient. 34−38,98−100 In contrast, the enzymes studied here are relatively inefficient (Table  S2), 15,18,79 with turnover numbers of ∼10 s −1 or less, 18,52,101 despite catalyzing an intrinsically facile reaction. 84 Furthermore, while loop motions in TIM-barrel enzymes such as TPI are relatively fast (on the μs timescale), 96 motions on the ms timescale have been detected in the evolved HisA variants, 18 and thus loop motions are likely to be (at least partially) ratelimiting in these enzymes.
At the simplest level, our simulations agree with structural data 15 that TrpF, PriA, and HisA have increasingly large (in terms of active site volume) and "breathable" active sites (Table S8), allowing for the accommodation of ProFAR by HisA and PriA but not PRA-specific TrpF, the active site of which is too small to accommodate the larger substrate. 15 More importantly, HisA and PriA both possess "gripper residues" interacting with the nonreactive phosphodianion group of ProFAR (R83 and S103 in HisA; R85 and T105 in PriA, Figure 2), that are very similar to analogous interactions in other enzymes activated by ligand-gated conformational changes, such as TPI. 34,35 Of note, however, is that the HisA/ PriA gripper residues are contributed from the less mobile loops 3 and 4, unlike the primary gripper loop in TPI (loop 6), which undergoes a substantial conformational change upon ligand binding. 12 Other TIM-barrel proteins such as OMPDC possess analogous gripper loops to TPI loop 6, 12 showing evidence for convergent evolution on these different (βα) 8barrel scaffolds.
Our simulations show that while the gripper interaction is stable in HisA throughout the simulations, in PriA, there is electrostatic repulsion between R85 and an additional active site arginine, R143, which causes instabilities in the catalytic loops (in particular loop 5, Figure S3), as well as in the substrate positioning in the active site such that ProFAR is less stable in the PriA active site than in the HisA active site. This is in effect a ligand-gated effect, where interaction with the nonreactive phosphodianion (which is not present in the smaller substrate, PRA) facilitates the stability of catalytically important loop 5. Thus, the underlying principles driving loop stability are similar to those of other enzymes that are activated by ligand-gated changes.
PCA on our simulations showed that the active site loops in these enzymes can sample a range of wide-open conformations with transitions between them, with their conformational flexibility being stabilized by ligand binding (although less so in the bifunctional MtPriA than the ProFAR-specific SeHisA). In contrast, the active site loops in TmTrpF, which binds a smaller substrate and lacks the gripper, remain dynamic. In particular, loop 6 ( Figure 8) samples a range of open conformations even with PRA bound to the active site. Additionally, all HisA variants from the real-time evolution experiment 46 studied here also sample a range of open and wide-open conformations. In this context, however, the D10G substitution on loop 1 appears to be sufficient by itself to increase the population of the closed conformation sampled during our simulations (Table 1), and the highest proportion of closed conformation is observed in simulations of the SeHisA(dup13−15/D10G/Q102A/Q24L/V15[b]M) variant, which has the highest TrpF activity 18 (Table S2).
Pulling simulations, where we pulled products PRFAR and CdRP out of the SeHisA(dup13−15/D10G/G102A/Q24L) active site, showed significant conformational changes in both loops 1 and 6 when pulling PRFAR out, whereas CdRP was pulled out with loop 1 remaining stable and loop 6 opening. This suggests that loop 1 dynamics are more important for binding of ProFAR and subsequent release of PRFAR, than for the smaller substrate PRA and its product CdRP, whereas loop 1 dynamics appears to be critical to catalysis (Table S11). In addition, as mentioned above, these enzymes are relatively inefficient (Table S2). 18,52,101 When taking this into account, the substantial rearrangements that we observe for both compounds suggests that a slow (potentially rate-limiting) product release step that requires substantial loop rearrangement may be the reason for the low turnover numbers observed for an otherwise facile reaction, further providing JACS Au pubs.acs.org/jacsau Article evidence that turnover rates are being regulated by loop dynamics.
In contrast to HisA, which undergoes conformational changes of loop 1 during the real-time evolution experiment that changes its selectivity from ProFAR-specific to PRAspecific, 18,46 the bifunctional PriA is already able to rearrange its active site in its wild-type form, to accommodate the different substrates through alternation between pro-ProFAR and pro-PRA conformations of loop 5. 15 These conformational changes both reduce repulsion between the two active site arginine side chains that in turn destabilize loop 5 dynamics ( Figure S3), as well as disrupting the stacking interaction between the W145 side chain and the substrate ProFAR ( Figure S4). This rationalizes the preference of this enzyme toward PRA rather than ProFAR, despite its similarities with the HisA active site. We also note that loss of the stacking interaction between W145 and ProFAR was one aspect of the gain of PRA isomerization activity in the SeHisA(dup13−15) variant from the real-time evolution experiment. 18,46 Finally, we performed EVB simulations of the first ringopening step of ProFAR and (where relevant) PRA isomerization ( Figure 1) by wild-type HisA, PriA, and variants. As described above, the actual rate of the chemical step in these enzymes is unknown, since the rate-limiting step in, at least tryptophan biosynthesis, appears to be an uncatalyzed ketoenol tautomerization step. 85 Nevertheless, we saw that our calculated activation free energies for the ring-opening reaction trend well with the differences in activation free energies derived using the measured turnover numbers as an upper limit for this value, suggesting there is both a chemical and a dynamical component to the observed changes in activity upon substitution and/or duplication of key residues. Furthermore, to reproduce the relevant PriA activation free energies, it was necessary to start from different structures of loop 5, following earlier structural analysis that demonstrated the loop can exist in either a knot-like pro-PRA or β-hairpin pro-ProFAR conformation ( Figure S1), depending on which substrate is bound. 15 Clearly, the ease with which this rearrangement can occur will also impact the selectivity of this enzyme. Finally, EVB simulations of wild-type SeHisA, MtPriA and the SeHisA(dup13−15/D10G/G102A/Q24L/V15[b]M) variant with loop 1 in an open conformation all yielded substantially higher energies for PRA isomerization, whereas ProFAR isomerization (by wild-type HisA) seems to be unaffected (Tables S10 and S11). This further emphasizes the importance of the correct closure of loop 1 for isomerization of the smaller substrate PRA. This is in contrast to substrate binding, where the conformational plasticity of loop 1 appears to be far more important for facilitating the correct binding of ProFAR than PRA ( Figure S8).
Taken together, these observations highlight the critical role of multiple decorating loops of HisA, PriA, and TrpF in facilitating catalysis. These enzymes stand out from prior systems that have been demonstrated to be activated by ligandgated conformational changes 34−41 due to a number of factors. First, we have shown the interdependent motion of three long loops (or two in TrpF), none of which dominates and each of which is capable of undergoing substantial conformational changes to facilitate the turnover of different substrates. Second, unlike prior systems which show substantial rate accelerations compared to the uncatalyzed reactions with comparatively rapid loop motions, in these enzymes, the catalyzed reaction is already intrinsically fast, 84 whereas loop motion is slow and appears to be controlling the reaction rate.
It has been argued that a PriA-like gene product could have been the common evolutionary ancestor for both HisA and TrpF. 53, 87,102 Ancestral sequence reconstruction has also been used to suggest that ancient HisA precursors were likely bifunctional and that this bifunctionality persisted over at least a 2 billion year time span. 52 However, as shown in Figure 5, HisA and PriA exploit loops 1, 5, and 6 to facilitate activity, whereas TrpF lacks an analogue for loop 1, and isomerizes PRA harnessing just two catalytic loops 3 and 6. Our results suggest that an evolutionary trajectory from a PriA-like ancestor to an extant TrpF would be surprisingly complex. Loop 1 would need to be truncated (and not extended, as when SeHisA was artificially evolved into a TrpF), 18 and the interdependency of this third loop would need to be lost, raising the question of what evolutionary path would take a PriA-like precursor to TrpF, while completely abolishing this loop. We note that the natural or evolved bifunctionality in these enzymes being driven by the ability to adapt different conformational states is consistent with prior studies on the laboratory evolved bifunctionality of a phosphotriesterase from Pseudomonas diminuta, 30,103 suggesting that it may be a more common phenomenon not limited to these enzymes.
Despite the novel aspects of the systems studied here, a key similarity with prior systems is the generalizability of ligandgated conformational changes across a wide range of systems, in particular TIM-barrel proteins, 34 which tend to possess flexible loops decorating their active sites. The conservation of such ligand-gated conformational changesalbeit triggered in different loopssuggests that these loops evolve independently of the barrel providing a starting point for the emergence and divergence of new enzyme activities. 21,47 In addition, TrpF, for example, has been shown to be highly tolerant of variations in loop 6 sequence such that grafting sequences from related enzymes such as TrpA, HisA, and PriA onto the TrpF scaffold did not abolish activity. 104 This is significant considering the high evolvability of this scaffold, 8 and the wide range of chemistry it supports, 47 which makes it very desirable as a starting point for protein engineering efforts. In addition, it could be argued that the real-time evolution experiment that bestowed PRA isomerization activity to HisA 46 effectively performed "natural" loop-engineering by altering the conformations of key active site loops. 18 Our work therefore suggests that, more broadly, loop grafting and engineering is a powerful tool for generating novel enzymes with tailored activities and specificities, even in complex systems with multiple highly mobile and interdependent catalytic loops.